Low-rank approximation

Column span

The column span of a matrix 𝐗n×d\mathbf{X} \in \mathbb{R}^{n \times d} is the set of all vectors that can be written as 𝐗𝐚\mathbf{Xa} for some 𝐚d\mathbf{a} \in \mathbb{R}^d. The dimension of the column span DcD_c is the maximum number of linearly independent vectors in the column span.

Row span

The row span of a matrix 𝐗n×d\mathbf{X} \in \mathbb{R}^{n \times d} is the set of all vectors that can be written as $\mathbf{X}^{\sf T}\mathbf{b}$ for some 𝐛d\mathbf{b} \in \mathbb{R}^d. The dimension of the row span DrD_r is the maximum number of linearly independent vectors in the row span.

Rank

We have DcdDrnDc=DrD_c \leq d \qquad D_r \leq n \qquad D_c = D_r

We call the value Dc=DrD_c = D_r, the rank of 𝐗\mathbf{X}.

Low-rank approximation

Approximate 𝐗\mathbf{X} as the product of two rank-kk matrices.

Use two matrices 𝐂n×k\mathbf{C} \in \mathbb{R}^{n \times k} and 𝐁d×k\mathbf{B} \in \mathbb{R}^{d \times k}, where k<min(n,d)k < \min (n,d). Typically we want to choose 𝐂\mathbf{C} and 𝐁\mathbf{B} to minimize min𝐁,𝐂𝐗𝐂𝐁\min_{\mathbf{B},\mathbf{C}} \lVert \mathbf{X} - \mathbf{CB} \rVert for some matrix norm, e.g. Frobenius norm or square of as $\lVert \mathbf{X} - \mathbf{CW}^{\sf T} \rVert_F^2$

Without loss of generality can assume right matrix is orthogonal, i.e. $\mathbf{W}^{\sf T}$ with $\mathbf{W}^{\sf T}\mathbf{W} = \mathbf{I}$.

#incomplete


References:

  1. https://www.chrismusco.com/amlds2023/notes/lecture11.html#Low-Rank_Approximation